Tagging Inflective Languages: Prediction of Morphological Categories for a Rich, Structured Tagset
Authors
Abstract
purposes, it has been tagged by our tagger; errors are printed underlined and corrections are shown.} Hlavním/AAIS7 .... IA- problémem/NNIS7 ..... A--
Similar resources
Morphological Tagging: Data vs. Dictionaries
Part-of-speech tagging for English seems to have reached the human level of error, but full morphological tagging for inflectionally rich languages, such as Romanian, Czech, or Hungarian, is still an open problem, and the results are far from satisfactory. This paper presents results obtained by using a universalized exponential feature-based model for five such languages. It focuses...
Part-of-speech Tagging for Grammar Checking of Punjabi (The Linguistics Journal, Volume 4, Issue 1)
Part-of-speech (POS) tagging is one of the major activities performed in a typical natural language processing application. This paper explores part-of-speech tagging for the Punjabi language, a member of the Modern Indo-Aryan family of languages. A tagset for use in grammar checking and other similar applications is proposed. This fine-grained tagset is based entirely on the grammatical catego...
Tigrinya Part-of-Speech Tagging with Morphological Patterns and the New Nagaoka Tigrinya Corpus
This paper presents the first part-of-speech (POS) tagging research for Tigrinya (Semitic language) from the newly constructed Nagaoka Tigrinya Corpus. The raw text was extracted from a newspaper published in Eritrea in the Tigrinya language. This initial corpus was cleaned and formatted in plaintext and the Text Encoding Initiative (TEI) XML format. A tagset of 73 tags was designed, and the co...
Automatic Morphological Analysis for Russian: a Comparative Study
In this paper we present a comparison of ten systems for automatic morphological analysis: TreeTagger, TnT, HunPos, Lapos, Citar, Morfette, Mystem, Pymorphy, Stanford POS tagger and SVMTool. Different training and tagging approaches are discussed together with the strengths and weaknesses of each system. Probabilistic taggers were trained and tested on the Russian National Disambiguated Corpus a...
A Positional Tagset for Russian
Fusional languages have rich inflection. As a consequence, tagsets capturing their morphological features are necessarily large. A natural way to make a tagset manageable is to use a structured system. In this paper, we present a positional tagset for describing morphological properties of Russian. The tagset was inspired by the Czech positional system (Hajič, 2004). We have used preliminary ve...
Publication date: 1998